ABSTRACT

Star-galaxy classification is one of the most fundamental data-processing tasks in 
survey astronomy, and a critical starting point for the scientific exploitation of survey 
data. Star-galaxy classification for bright sources can be done with almost complete 
reliability, but for the numerous sources close to a survey's detection limit each image 
encodes only limited morphological information about the source. In this regime, from 
which many of the new scientific discoveries are likely to come, it is vital to utilise 
all the available information about a source, both from multiple measurements and 
also prior knowledge about the star and galaxy populations. This also makes it clear 
that it is more useful and realistic to provide classification probabilities than decisive 
classifications. All these desiderata can be met by adopting a Bayesian approach to 
star-galaxy classification, and we develop a very general formalism for doing so. An 
immediate implication of applying Bayes's theorem to this problem is that it is formally 
impossible to combine morphological measurements in different bands without using 
colour information as well; however we develop several approximations that disregard 
colour information as much as possible. The resultant scheme is applied to data from 
the UKIRT Infrared Deep Sky Survey (UKIDSS), and tested by comparing the results 
to deep Sloan Digital Sky Survey (SDSS) Stripe 82 measurements of the same sources. 
The Bayesian classification probabilities obtained from the UKIDSS data agree well 
with the deep SDSS classifications both overall (a mismatch rate of 0.022, compared 
to 0.044 for the UKIDSS pipeline classifier) and close to the UKIDSS detection limit 
(a mismatch rate of 0.068 compared to 0.075 for the UKIDSS pipeline classifier). 
The Bayesian formalism developed here can be applied to improve the reliability of 
any star-galaxy classification schemes based on the measured values of morphology 
statistics alone. 
1 INTRODUCTION

Astronomical surveys now gather data on huge numbers of
astronomical objects: the 2 Micron All Sky Survey (2MASS;
Skrutskie et al. 2006), the Sloan Digital Sky Survey (SDSS;
York et al. 2000) and the UKIRT Infrared Deep Sky Survey
(UKIDSS; Lawrence et al. 2007) have all identified hundreds
of millions of distinct sources. The scale of these projects
immediately necessitates an automated approach to data
analysis (although an intriguing alternative is the Galaxy
Zoo project described by Lintott et al. 2008). Considerable
effort has been put into developing algorithms which can
decompose an image into a smooth background and a
catalogue of discrete objects, the properties of which must be
characterised as well. Source positions, fluxes and shapes
can all be estimated reliably by using fairly simple
moment-based approaches (e.g., Irwin 1985; Bertin & Arnouts 1996),
but the separation of point-like stars from more extended
galaxies generally requires that at least some external
astrophysical information be included. As such, the problem of
star-galaxy classification is well suited to Bayesian methods in
which the measurements of a given source are combined with
prior knowledge of the astrophysical populations of which
the source might be a member. A practical formalism for
Bayesian star-galaxy classification is developed in this paper.

In Section 2 the existing methods of star-galaxy classification
are reviewed, with particular emphasis on those which
are at least partially Bayesian in nature. A general Bayesian
formalism for star-galaxy classification is then developed in
Section 3 and specialised to UKIDSS in Section 4. After
analysing a simulated sample in Section 5, the real UKIDSS
data are analysed - and the results compared to the
classifications from deeper SDSS data - in Section 6. The relative
merits of the Bayesian approach to star-galaxy classification
are summarised in Section 7.

All photometry is given in the native system of the
telescope in question. Thus SDSS u, g, r, i and z photometry is
on the AB system, whereas UKIDSS Y, J, H and K
photometry is Vega-based. The relevant AB to Vega conversions
are given in Hewett et al. (2006).



2 STAR-GALAXY CLASSIFICATION 
METHODS 

The problem of systematically classifying astronomical
images as either point-like (i.e., generally stars, but also
quasars, etc.) or extended (i.e., generally galaxies, but also
Galactic nebulae, etc.) goes back at least as far as Messier
(1781), and has been the subject of many investigations in
the time since. This problem is fundamental to astronomy,
but there is no universally agreed upon method of solving it,
and an almost bewildering number of different approaches
have been explored. This is because of varying desiderata
(e.g., algorithm speed; degree of automation; efficiency
versus completeness; the desire for class probabilities versus
absolute classification; etc.) and because different
information (morphological and/or colour or even spectroscopic) is
used. Hastie et al. (2008) give a general review of
classification methods, but there is no astronomy-specific equivalent,
so the various relevant approaches are summarised here.

The starting point for all methods of star-galaxy
classification is that stars and galaxies appear different, the latter
being more extended (at a given flux level) and also
exhibiting more variety. For bright sources these differences are
easily distinguished by the human eye (as demonstrated so well
by the Galaxy Zoo project; Lintott et al. 2008); the
challenge is to develop automatic algorithms that can perform
the same task from measured image properties. For
well-measured, high signal-to-noise ratio sources that are much
brighter than a survey's flux limit, star-galaxy separation
can be achieved easily, and almost any sensible algorithm
will achieve the desired results. The challenge is to treat
faint sources correctly, extracting whatever morphological
information is contained in the noisy measurements whilst
also avoiding overly confident classification in situations of
uncertainty.

The most basic, and probably most commonly used,
classification method is to make simple heuristic cuts in
the space of observable image properties (and related
statistics, such as the measured second-order moments or
kurtosis). Cuts in this space are either chosen empirically
(e.g., Leauthaud et al. 2007; Kron 1980; Yasuda et al. 2001;
Irwin et al. 2010) or fit to the data (e.g., MacGillivray et al.
1976; Heydon-Dumbleton et al. 1989). Such cut-based
methods of star-galaxy separation have a number of benefits:
they are clearly defined; they are easy to repeat or simulate;
and they correctly classify the majority of sources. However,
cut-based methods also have several important limitations:
the choice of cuts can be essentially arbitrary; it is difficult
to include information about the populations as a whole;
they classify every source with certainty, which is almost
always unjustified close to the sample's magnitude limits; and
(partly due to the definite classification) it is difficult to
combine the potentially conflicting classifications from different
bands or observations.

The arbitrary nature of heuristic cuts can be avoided by
using automated classification techniques. The use of
neural networks, such as multi-layer perceptrons, to perform
star-galaxy classification was pioneered by Odewahn et al.
(1992) and forms a core part of the astronomical image
analysis packages SExtractor (Bertin & Arnouts 1996) and
NExt (Andreon et al. 2000). The use of decision trees has
also been explored, with both axis-parallel (Weir et al. 1995;
Ball et al. 2006) and oblique (Suchkov et al. 2005) trees
applied with varying degrees of success. All the above
classification methods are objective, but they are also opaque, and
it can be hard to predict their behaviour outside the
parameter range in which they were trained and tested. The need
for reliable training data can also be a problem, as this can
require considerable human input and it is difficult to ensure
that the necessary parameter range is covered.

Any method which decisively classifies all sources has a
fundamental problem. While the images of the bright sources
in any sample generally contain enough information to
justify decisive classifications, many of the faint sources near a
survey's limit should not be classified with such great
certainty. This issue has been tackled using a number of
different techniques: mixture models (Miller & Browning 2003);
fuzzy k-means clustering (Mähönen & Frantti 2000);
semi-supervised clustering (Jarvis & Tyson 1981); and
difference-boosting networks (Philip et al. 2002). These methods are
capable of providing non-decisive classifications, but they
still tend towards over-fitting in the absence of constraining
population models.

The critical point is that, for poorly-measured sources, 
there is potentially more information contained in the overall 
constraints on the star and galaxy populations than there 
is in the noisy image of the source in question. Including 
both types of information in a logically consistent way can 
be achieved by applying Bayes's theorem to obtain poste- 
rior class probabilities. Contaminated samples of stars or 
galaxies could be obtained by adopting probability cuts, 
but ideally the probabilities themselves would be retained 
for all sources. Even though the source populations are not 
known perfectly, reasonable - if imprecise - models should 
give more realistic results for faint sources than any method 
which does not account for the source populations at all. 

A fully principled Bayesian formalism for star-galaxy 
classification would involve using (parameterised) models for 
stars and galaxies to evaluate the conditional probabilities 
that a measured image was drawn from each of the two 
populations. Comparing these two model likelihoods then 
yields the posterior probability that a source is a star. For 
all its formal correctness, however, this is a very involved 
approach to inferring a single number. Indeed, none of the 
existing Bayesian implementations of star-galaxy classifica- 
tion (taken to include any method which uses information 
on the source populations as well as the target image) have 
gone to this extreme, and all adopt a variety of approxima- 
tions to make the problem more tractable. 

Probably the most fully principled Bayesian star-galaxy
classification algorithms implemented to date are those of
Sebok (1979) and Bazell & Peng (1998), who compared fits
to the (calibrated) pixel values of the images. However the
need to model, e.g., the spiral arms of brighter galaxies
meant that, paradoxically, extra care had to be taken with
the brightest images that should have been easiest to classify.
This is an example of the somewhat counter-intuitive result
(John 1997) that attempting to use all the available data
does not necessarily produce the most discriminating
classifier, especially when machine learning methods are used
(e.g., Bazell & Miller 2005; Ball et al. 2006).



The problem of galaxy complexity can be overcome
by using a small number of parameters - and preferably
just one - to characterise how discrepant an image is from
those of similar stars observed in comparable conditions.
Many morphology statistics have been developed (e.g., Irwin
1985; Scranton et al. 2005), and while they are generally not
used in a Bayesian context, any such statistic can be used
as a data surrogate. This fact was utilised very effectively
by Scranton et al. (2002), who used the difference between
the point-spread function (PSF) magnitude and the best
fit galaxy profile model magnitude (defined as the
concentration) as a measure of the extent of an image. However,
rather than adopting parameterised models of the
underlying star and galaxy populations, they fit a mixture model of
Gaussians to the double-peaked r-band concentration
distribution in a number of discrete magnitude ranges. Overall
this combines simplicity and clarity whilst retaining
sufficient information from the image and the populations to
make excellent classifications. An obvious extension would
have been to combine the data from all five SDSS bands (cf.
Koo & Kron 1982; Lupton et al. 2001). In general this has
proved problematic due to the combination of the different
depths and the range of source colours that can, in
particular, result in non-detections (e.g., Richards et al. 2004,
Suchkov et al. 2005 and Ball et al. 2006 all discard objects
that are not detected in all bands).

Multi-band measurements were used in a very different
way by Wolf et al. (2001) (see also Richards et al. 2004),
who classified sources using colour data. They utilised kernel 
density estimation (KDE) to calculate class densities in the 
space of observable quantities (in this case measured colours) 
and then applied Bayesian model selection to obtain a final 
classification. The disadvantage of this approach is the need 
for a large training set (in order to run the KDE on the stars 
and galaxies separately). The use of the noise-convolved, 
rather than the intrinsic, distributions can also result in sub- 
optimal inferences due to the inevitably greater overlap of 
the observed distributions. 
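
To make the KDE-based scheme described above concrete, the following is a minimal sketch of the general idea only; it is not the implementation of Wolf et al. (2001). It assumes a single observed colour, a fixed Gaussian kernel bandwidth, class priors equal to the training-set fractions, and entirely made-up training colours. The function names `gaussian_kde` and `kde_star_probability` are hypothetical.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` with Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def kde_star_probability(colour, star_colours, galaxy_colours, bandwidth=0.1):
    """Posterior star probability from class-conditional KDEs.

    Assumption: the class priors are the training-set fractions.
    """
    star_term = len(star_colours) * gaussian_kde(star_colours, bandwidth)(colour)
    galaxy_term = len(galaxy_colours) * gaussian_kde(galaxy_colours, bandwidth)(colour)
    return star_term / (star_term + galaxy_term)


# Made-up training colours (in magnitudes); not taken from any survey.
star_colours = [0.30, 0.35, 0.42, 0.38, 0.33]
galaxy_colours = [0.80, 0.95, 1.10, 0.88, 1.02, 0.91]

p_blue = kde_star_probability(0.36, star_colours, galaxy_colours)
p_red = kde_star_probability(0.98, star_colours, galaxy_colours)
print(f"P(star | colour = 0.36) = {p_blue:.3f}")
print(f"P(star | colour = 0.98) = {p_red:.3f}")
```

Because the densities are estimated from observed (noise-convolved) colours, the two class densities in such a sketch overlap more than the intrinsic ones would, which is the limitation noted above.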

Given the strengths and weaknesses of the various star- 
galaxy classification methods discussed above, we have in- 
vestigated the utility of a Bayesian approach in which the 
star and galaxy populations are modelled parametrically and 
in which the data from multiple observations can be com- 
bined. The focus is on trying to obtain the best classifica- 
tions for faint objects, with the provision that a decisive 
answer only be given if it is merited. 



3 PROBABILISTIC CLASSIFICATION OF 
ASTRONOMICAL SOURCES 

Suppose a noisy, seeing-smeared, and pixelated image of a
source has been measured. What can be inferred about the
type of object it is? Assuming there are $N_t$ distinct
populations of astronomical$^1$ objects, $t = \{t_1, t_2, \ldots, t_{N_t}\}$, under
consideration, the fullest answer to this question is to use the
available data, $d = \{d_1, d_2, \ldots, d_{N_d}\}$, to calculate the
conditional probabilities$^2$, $\Pr(t|d)$, for each $t$. Applying Bayes's
theorem yields

$$\Pr(t|d) = \frac{\Pr(t)\,\Pr(d|t)}{\sum_{t'=1}^{N_t} \Pr(t')\,\Pr(d|t')}, \qquad (1)$$

where $\Pr(t)$ is the prior probability that the source is of
type $t$ and $\Pr(d|t)$ is the probability (density) of getting the
observed data under the hypothesis that the source is of type
$t$. Known as the evidence or the model likelihood, the latter
is given by

$$\Pr(d|t) = \int \Pr(\theta_t|t)\,\Pr(d|\theta_t, t)\,d\theta_1\,d\theta_2 \ldots d\theta_{N_p}, \qquad (2)$$

where $\Pr(\theta_t|t)$ is the usual unit-normalised prior
distribution of the $N_p$ model parameters, $\theta_t = \{\theta_1, \theta_2, \ldots, \theta_{N_p}\}$,
that describe objects of type $t$, and $\Pr(d|\theta_t, t)$ is the
probability (density) of measuring the observed data given a
particular value of this model's parameters (i.e., the likelihood).
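
As a purely illustrative sketch, Eqs. (1) and (2) can be evaluated numerically for a toy problem. Everything specific here is an assumption made for the example: each population has a single parameter with a Gaussian prior, the datum is a single Gaussian-distributed measurement, and the prior class probabilities and all numerical values are invented. The helper names `evidence` and `posterior` are hypothetical.

```python
import math


def gaussian(x, mu, sigma):
    """Unit-normalised Gaussian density with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)


def evidence(d, prior_mu, prior_sigma, noise_sigma, n_steps=4000):
    """Eq. (2) for one parameter: integrate prior x likelihood (trapezoid rule)."""
    lo = prior_mu - 10.0 * prior_sigma
    hi = prior_mu + 10.0 * prior_sigma
    step = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        theta = lo + i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * gaussian(theta, prior_mu, prior_sigma) * gaussian(d, theta, noise_sigma)
    return total * step


def posterior(priors, evidences):
    """Eq. (1): normalise prior x evidence over all populations."""
    weighted = {t: priors[t] * evidences[t] for t in priors}
    total = sum(weighted.values())
    return {t: w / total for t, w in weighted.items()}


# Toy populations: (prior mean, prior width) of the single parameter. Invented values.
models = {"star": (0.0, 0.3), "galaxy": (2.0, 1.0)}
priors = {"star": 0.3, "galaxy": 0.7}
d_obs, noise = 0.8, 0.5

evidences = {t: evidence(d_obs, mu, sig, noise) for t, (mu, sig) in models.items()}
probs = posterior(priors, evidences)
print(probs)
```

For this Gaussian toy model the evidence is also available in closed form (a Gaussian of width $(\tau^2 + \sigma^2)^{1/2}$, for prior width $\tau$ and noise $\sigma$), which gives a simple check on the numerical integral.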

Whilst Eq. (1) is a standard application of Bayes's
theorem, its practical implementation is not so clear in an
astronomical context. Demanding that the prior distribution of each
population's parameters be normalised to unity is awkward,
as is the notion of a prior probability of the nature of a
source. Out of context, the question 'What is the
probability that a source is a star?' does not have a sensible answer,
leaving the priors undefined. Some constraining information
is required, such as a range of fluxes or colours, as all
probabilities are conditional. The question 'What is the
probability that a source with a magnitude of $i \leq 21.0$ is a star?'
does have a numerical answer, given approximately by the
observed numbers of stars and galaxies down to the specified
limit. This would yield a reasonable empirical value for the
priors in Eq. (1), although even here the answer depends
on Galactic latitude, due to the variation in the stellar
density. The implication is that the prior for each population
would have to be defined differently for surveys with, e.g.,
different footprints on the sky or different depths, a far from
satisfactory situation.



1 The model selection approach followed here is conditional on
the source being drawn from one of the astronomical
populations that have explicitly come under consideration. It would also
be possible to include various non-astronomical noise processes
amongst the models that might explain the data, such as cosmic
rays and random noise spikes. The difficulty in implementing this
idea is that, whereas most astrophysical populations are at least
reasonably well constrained, the huge variety of poorly
understood noise processes is far more difficult to quantify.

2 Throughout this paper we have replaced the more formal
$\Pr(T = t | D = d)$, where $T$ is the object type variable and $D$
is the random vector giving the available data, by the less
cumbersome, if occasionally ambiguous, $\Pr(t|d)$.

These ambiguities can be resolved by rewriting Eq. (1) as

$$\Pr(t|d) = \frac{W_t(d)}{\sum_{t'=1}^{N_t} W_{t'}(d)}, \qquad (3)$$

where we introduce the weighted evidence,

$$W_t(d) = \int \rho_t(\theta_t)\,\Pr(d|\theta_t, t)\,d\theta_1\,d\theta_2 \ldots d\theta_{N_p}. \qquad (4)$$

Here $\rho_t(\theta_t)$ is the number density (per unit solid angle or per
unit volume) of all type $t$ sources - not just those that might
be detected in the survey under consideration - as a function
of their parameters$^3$. For Eq. (4) to be valid, $d$ needs to
include whether or not the source has been detected, as well
as its observed properties.

The main benefit of using $\rho_t(\theta_t)$, instead of the
unit-normalised prior $\Pr(\theta_t, t) = \Pr(t)\Pr(\theta_t|t)$, is that the source
density has an absolute, empirical and context-independent
normalisation, given by the number of observed sources.
Not being dependent on generally arbitrary parameter space
boundaries, it is independent of the details of the current
experiment, and need only be calculated once. The
detection probability is included in $\Pr(d|\theta_t, t)$, which is
survey-dependent.

Equations (4) and (3) describe a general method for
probabilistic classification of astronomical sources, by ex- 
plicitly combining the information contained in the mea- 
surements of a source with existing knowledge of the popu- 
lations from which it might have been drawn. When applied 
to the more specific problem of star-galaxy classification 
these equations simplify further still. 
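
The role of the source density in Eqs. (3) and (4) can be illustrated with a minimal sketch of the single-band case in which the only parameter is the true magnitude $m$. The assumptions are ours and are made only for illustration: power-law number counts with invented amplitudes and slopes, Gaussian photometric noise in magnitudes, and no detection term. The function names are hypothetical.

```python
import math


def gaussian(x, mu, sigma):
    """Unit-normalised Gaussian density with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)


def power_law_counts(m, amplitude, slope, m_ref=20.0):
    """Hypothetical counts per magnitude per unit solid angle: A * 10**(slope * (m - m_ref))."""
    return amplitude * 10.0 ** (slope * (m - m_ref))


def weighted_evidence(m_hat, sigma, amplitude, slope, half_width=10.0, n_steps=4000):
    """Eq. (4) with theta = m: integrate rho_t(m) * Pr(m_hat | m) over the true magnitude."""
    lo = m_hat - half_width * sigma
    step = 2.0 * half_width * sigma / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        m = lo + i * step
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * power_law_counts(m, amplitude, slope) * gaussian(m_hat, m, sigma)
    return total * step


# Invented (amplitude, slope) pairs; the steeper slope mimics galaxy counts.
populations = {"star": (1000.0, 0.3), "galaxy": (2000.0, 0.6)}
m_hat, sigma = 21.0, 0.2

weights = {t: weighted_evidence(m_hat, sigma, a, s) for t, (a, s) in populations.items()}
total = sum(weights.values())
probs = {t: w / total for t, w in weights.items()}  # Eq. (3)
print(probs)
```

Even though the assumed counts rise without bound towards faint magnitudes, the Gaussian likelihood falls off faster, so the integral is finite; no separate prior normalisation is needed because the densities carry absolute units.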



3.1 Star-galaxy classification

The probabilistic astronomical classification formalism de- 
scribed above can be applied effectively to star-galaxy clas- 
sification by making several simplifying assumptions: that 
every source is either a star or a galaxy; that the useful 
morphological information in an image can be compressed 
into a single statistic; and that the source flux is sufficiently 
well measured that the uncertainty in the photometry can 
be ignored. Each of these approximations means the resul- 
tant class probabilities are taken away from the ideal value 
that would be obtained if all the available information were 
utilised, but the implicit information loss is only signifi- 
cant to the degree it changes the final classifications. As the 
bright, well-measured sources in any sample will be success- 
fully classified by any sensible algorithm, it is only necessary 
to ensure that the useful information for the faint sources 
near the survey limit is retained. In the context of star- 
galaxy separation there is no benefit in trying to encode 
the wealth of morphological information present in, e.g., the 



3 In the simple case that $\theta_t$ was a source's apparent magnitude
in a given band, $m$, then $\rho_t(\theta_t) = \rho_t(m)$ would just be the
number counts in that band, but continuing, potentially unbounded,
below the detection limit of the survey in question. The
potentially infinite number of ultra-faint sources is irrelevant as $\rho_t(m)$
is multiplied by the likelihood [$\Pr(\hat{m}|m)$ in this simple case] which
ensures that the product of the source density and the likelihood
is finite and that the integral in Eq. (4) converges.



image of a bright barred spiral galaxy - a statistic that ac- 
curately represented the degree to which a faint source is 
extended beyond the PSF is far more useful. The guiding 
principle in the approximations adopted here is whether they 
will significantly alter the classifications of the ambiguous 
faint sources. 

How many different populations should be considered
for a typical source detected in an astronomical survey? The
vast majority of known sources are either Galactic stars (i.e.,
$t = s$) or galaxies (i.e., $t = g$). The next most common are
quasars; but, as their name suggests, most appear as
point-sources in the optical or near-infrared (NIR) bands, and so
can be included with the stars in the context of
morphological classification. Hence the set of models can reasonably be
reduced to $t = \{s, g\}$. Equation (3) can then be simplified
to give the probability that a source is a star as

$$P_s = \Pr(s|d) = \frac{W_s(d)}{W_g(d) + W_s(d)}. \qquad (5)$$

Thus the full result of the calculation is just a single number,
$P_s$.
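
In practice the two weighted evidences in Eq. (5) can differ by many orders of magnitude, so it can be convenient to work with their logarithms. The following is a minimal sketch of that design choice, which is our own suggestion and not something prescribed by the formalism; the function name `star_probability` is hypothetical.

```python
import math


def star_probability(log_w_star, log_w_galaxy):
    """Eq. (5) evaluated from ln W_s and ln W_g without overflow."""
    diff = log_w_galaxy - log_w_star
    if diff > 0.0:
        # Galaxy evidence dominates: exp(-diff) <= 1, so this cannot overflow.
        ratio = math.exp(-diff)
        return ratio / (1.0 + ratio)
    # Star evidence dominates (or the two are equal): exp(diff) <= 1.
    return 1.0 / (1.0 + math.exp(diff))


print(star_probability(math.log(3.0), math.log(1.0)))  # W_s = 3 W_g
print(star_probability(-1000.0, 0.0))                   # overwhelmingly a galaxy
```

Writing the result this way only ever exponentiates a non-positive number, so extreme evidence ratios give probabilities of (numerically) zero or one rather than an arithmetic error.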

It is possible to simplify the problem of star-galaxy
classification by considering only generic measurable
properties of a source. Following the arguments in Section 2, it
is assumed that each of the available images of a source
provides only a single morphology statistic, $c$, which
encodes the degree to which it is not point-like. There is
great freedom in how $c$ is constructed from the images,
and even what the fiducial stellar value is. The key point
is conceptual: the potentially large data and parameter
spaces are both greatly reduced by the use of a single
morphology measure. The relevant data are simply the
measured apparent magnitudes, $\{\hat{m}_1, \hat{m}_2, \ldots, \hat{m}_{N_b}\}$, and
measured morphology statistics, $\{\hat{c}_1, \hat{c}_2, \ldots, \hat{c}_{N_b}\}$, in each of
the $N_b$ bands in which measurements have been made
and in which the source has been detected. In general it
is also necessary to include the fact that the source has
been detected at all, as the detection probability is significantly
greater for the faintest point-like objects near a survey's detection
limit than for extended sources. Hence the full data vector is
$d = \{\mathrm{det}, \hat{m}_1, \hat{c}_1, \hat{m}_2, \hat{c}_2, \ldots, \hat{m}_{N_b}, \hat{c}_{N_b}\}$, where det encodes
whether the source is detected or not. The parameters used
to describe a source's observable intrinsic properties are its
(true) apparent magnitudes, $\{m_1, m_2, \ldots, m_{N_b}\}$, in each of
the $N_b$ bands, and its (true) morphology statistic$^4$, $c$. The
full parameter vector is then $\theta = \{m_1, m_2, \ldots, m_{N_b}, c\}$.



4 The notion of a true morphology statistic is somewhat artificial, 
given that c is generally defined in terms of image properties such 
as pixel values; however it is taken to be the value of the mor- 
phology statistic that would have been measured if the source was 
observed without photometric noise, but with the smearing of the 
observational PSF. As such c is not actually an intrinsic property 
of the source. Another potential ambiguity is that c could have dif- 
ferent values in each band, (e.g., due to star-formation regions in 
the arms of a spiral galaxy being more prominent in shorter wave- 
length bands), although such discrepancies would be strongest in 
the better-resolved, brighter galaxies that can be easily classified 
anyway. 

Substituting the above definitions of $d$ and $\theta$ into
Eq. (4), the weighted evidence can be written as

$$W_t(d) = \int \rho_t(m_1, m_2, \ldots, m_{N_b}, c)\,
\Pr(\mathrm{det}, \hat{m}_1, \hat{c}_1, \hat{m}_2, \hat{c}_2, \ldots, \hat{m}_{N_b}, \hat{c}_{N_b} | m_1, m_2, \ldots, m_{N_b}, c, t)\,
dm_1\,dm_2 \ldots dm_{N_b}\,dc. \qquad (6)$$

Note that, due to the choice of observable model
parameters, the likelihood now has the same form for both stars and
galaxies, whereas in Eq. (2) it was population-dependent (as
there was the possibility of using intrinsic physical
parameters such as spectral type or Hubble type, which are only defined
for stars and galaxies, respectively). The form of the
population density and the prior can now be treated separately,
and both can be usefully simplified further.

The likelihood should encode photometric
uncertainties and the limitations of the morphological measurements,
as well as correlations between measurements in different
bands. It is, however, reasonable to assume that
inter-band photometric noise correlations are negligible (but see
Scranton et al. 2005), in which case the likelihood becomes
a product over the $N_b$ bands. It is also reasonable to assume
that the photometric part of the likelihood is Gaussian in
magnitude units - whilst this approximation breaks down
for faint sources (e.g., Mortlock et al. 2010), all the sources
here are unambiguously detected. It is, however, necessary
to include the survey incompleteness, expressed here as the
probability that a source is detected in at least one band
(or, more specifically, in a reference band). The detection
probability is assumed to drop from unity to zero over a
magnitude range $\Delta m_t$, around the nominal detection limit
of the survey, $m_{\mathrm{lim},b}$. The specific form adopted for the
incompleteness is

$$\Pr(\mathrm{det}|m_b) = \frac{1}{2}\,\mathrm{erfc}\left(\frac{m_b - m_{\mathrm{lim},b}}{\Delta m_t}\right), \qquad (7)$$

where $\mathrm{erfc}(x) = 2\int_{2^{1/2}x}^{\infty} \mathcal{N}(x'; 0, 1)\,dx'$ is the
complementary error function, and $\mathcal{N}(x; \mu, \sigma) = \exp\{-1/2[(x - \mu)/\sigma]^2\}/[(2\pi)^{1/2}\sigma]$
is the unit-normalised Gaussian probability density with mean $\mu$
and variance $\sigma^2$. Although the
detection limits for stars and galaxies are likely to be similar,
the tail of this distribution is significantly longer for stars
(as, being more centrally concentrated, there is a greater
chance of faint stars meeting the detection criteria of most
surveys). A somewhat subtle result of this is that the
majority of the very faintest sources in a sample generated in
this way are stars, even for surveys that are sufficiently deep
that galaxies are intrinsically much more numerous at such
faint fluxes.

Combining the above assumptions, the likelihood for
stars and galaxies becomes

$$\Pr(\mathrm{det}, \hat{m}_1, \hat{c}_1, \hat{m}_2, \hat{c}_2, \ldots, \hat{m}_{N_b}, \hat{c}_{N_b} | m_1, m_2, \ldots, m_{N_b}, c, t)
= \prod_{b=1}^{N_b} \mathcal{N}[\hat{m}_b; m_b, \sigma_b(m_b)]\,\Pr(\hat{c}_b|c), \qquad (8)$$

where $\sigma_b(m_b)$ is the magnitude-dependent noise in band $b$. For
the fainter sources of most interest here (i.e., those within
a few magnitudes of the relevant detection limit), the noise
is background-dominated. The uncertainty for a source of
magnitude $m_b$ in band $b$ is then

$$\sigma_b(m_b) = \frac{1}{5}\,10^{(2/5)(m_b - m_{\mathrm{lim},b})}, \qquad (9)$$

where $m_{\mathrm{lim},b}$ is the limiting magnitude in band $b$, at which a
source would be detected with, on average, a signal-to-noise
ratio of 5.
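
The incompleteness and noise models can be sketched in a few lines. The detection probability follows Eq. (7). For the noise we assume the background-dominated form $\sigma_b = (1/5)\,10^{(2/5)(m_b - m_{\mathrm{lim},b})}$, chosen so that a source at the limiting magnitude has a signal-to-noise ratio of 5. The limiting magnitude and the two roll-off widths below are invented for illustration, with the wider value for stars mimicking the longer tail described above.

```python
import math


def detection_probability(m, m_lim, delta_m):
    """Eq. (7): detection probability, falling from 1 to 0 around m_lim."""
    return 0.5 * math.erfc((m - m_lim) / delta_m)


def magnitude_noise(m, m_lim):
    """Assumed background-dominated noise: 0.2 mag at the limit (S/N of 5)."""
    return 0.2 * 10.0 ** (0.4 * (m - m_lim))


M_LIM = 20.0          # hypothetical limiting magnitude
DELTA_M_STAR = 0.6    # hypothetical roll-off width for stars (longer tail)
DELTA_M_GALAXY = 0.3  # hypothetical roll-off width for galaxies

for m in (19.0, 20.0, 20.5):
    p_star = detection_probability(m, M_LIM, DELTA_M_STAR)
    p_gal = detection_probability(m, M_LIM, DELTA_M_GALAXY)
    print(f"m = {m:4.1f}: sigma = {magnitude_noise(m, M_LIM):.3f}, "
          f"Pr(det|star) = {p_star:.3f}, Pr(det|galaxy) = {p_gal:.3f}")
```

With these invented widths, a source half a magnitude fainter than the limit is noticeably more likely to be detected if it is a star, which is the mechanism behind the excess of stars among the very faintest detections.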

The sampling distribution of $\hat{c}_b$ is not as generic as the
distribution of $\hat{m}_b$, as $\hat{c}_b$ is necessarily a more complicated
statistic, the definition of which is survey-dependent. A
common choice (e.g., Irwin et al. 2010), for stars at least, is to
define $c$ such that $\Pr(\hat{c}_b|c) = \mathcal{N}(\hat{c}_b; 0, 1)$ by construction,
although even in such situations this relationship is not
always satisfied empirically (cf. Section 4.3). Combined with
the fact that almost nothing can be said about the form of
$\Pr(\hat{c}_b|c)$ in the abstract, it is left general for the moment.

The source density $\rho_t(m_1, m_2, \ldots, m_{N_b}, c)$ plays several
distinct roles in Eq. (5), most obviously encoding the
relative numbers of stars and galaxies at a given magnitude,
but also implicitly including their distribution of colours.
Making this distinction allows the more abstract source
density to be separated into the number counts in a
reference band, $dN_t/dm$, the conditional distribution of the
(true) morphology statistic, $\Pr(c|m, t)$, and a conditional
magnitude-dependent colour distribution,
$\Pr(m_1 - m_2, m_2 - m_3, \ldots, m_{N_b-1} - m_{N_b}|m)$. The likelihood could also be
rewritten as a function of one reference magnitude and colour
terms $m_1 - m_2$, $m_2 - m_3$, etc., without loss of
information. One important implication is that it is formally
impossible to separate colour and morphological information in
attempting to perform star-galaxy separation using
multi-band data. The fact that the morphology statistic of a
galaxy depends on its magnitude means that some
colour-dependent calibration of this relationship is required and
that this is different for stars and galaxies due to their
different colours. From a Bayesian perspective this is very natural:
all the available data (and external information) should be
brought to bear in any inference problem, with any
separability falling out as a matter of course. However, star-galaxy
classification is often an intermediate step towards a specific
science goal, including potentially exploratory work such as
searching for unusual objects. In such cases it is often
desirable to use colour information alone (e.g., to search for
compact galaxies, as in Drinkwater et al. 2003) or to use
morphological information alone (e.g., to search for
point-sources with unusual colours), but Eq. (6) shows that the
two are inextricably linked. Indeed, Baldry et al. (2010) use
morphology jointly with colour information to perform the
galaxy target selection for the Galaxy And Mass Assembly
(GAMA) survey. It is possible to produce heuristic
statistics which depend only on colour or morphology, but a
self-consistent Bayesian approach to star-galaxy classification
must include both - or make significant approximations.

It is the latter approach that is followed here, by the
potentially extreme step of ignoring the uncertainty in the
measured photometry, and instead treating a source's
measured magnitude in each band, $\hat{m}_b$, and its true magnitude,
$m_b$, as identical. This approximation is only justified
because of the peculiar nature of the problem at hand. Given
that the colour information is going to be ignored per se, the
only role it will play in the model is to allow the morphology
statistics of a source to be compared across bands. For
example, the values of $P_s$ calculated for two sources of different
colours, but with the same values of $\hat{c}_1$ and $\hat{c}_2$ in two bands,
could be quite different if only one was bright enough to be
well classified in a certain band. Provided that the typical
value of $c$ for an object of type $t$ does not vary rapidly with
its magnitude, it is a reasonable approximation to adopt the
average colour relationships for each population.

Applying the above simplifying assumptions to Eq. (6),
we obtain our final general, if approximate, expression for
the weighted evidence,

$$W_t(d) = \left.\frac{dN_t}{dm}\right|_{m=\hat{m}} \Pr(\mathrm{det}|\hat{m}, t) \int \Pr(c|m = \hat{m}, t) \prod_{b=1}^{N_b} \Pr(\hat{c}_b|c)\,dc, \qquad (10)$$

where $\hat{m}$ is the measured magnitude in the reference band
and $dN_t/dm$ are the differential number counts of type $t$
sources in this band. Note that the photometric data on the
source in question only enter Eq. (10) in the estimate of
the number counts and the estimate of the true
morphology statistic in each band. The source's measured values of
the morphology statistic in each band are used, however,
entering through the likelihood terms of the form $\Pr(\hat{c}_b|c)$.
Whilst it is impossible to fully escape the link between the
measured shapes and colours of an object, this formalism
emphasizes the former as much as is possible.

Despite the many simplifications that have been made to obtain
Eq. (10), the presence of the survey-specific morphology statistic
means that a more specific form of $W_t(\mathbf{d})$ can only be
obtained in the context of a specific survey or data-set. The
variation in image quality and depth, combined with the different
choices of morphology measure, means that the form of $P_s$ that
would be obtained by inserting Eq. (10) into Eq. (3) is our final
generic result.



4 STAR-GALAXY CLASSIFICATION IN UKIDSS

The Bayesian approach to star-galaxy classification described in
Section 3 is reasonably general and could be applied to generic
optical or NIR observations. However the need for explicit
population models means that its performance can only be examined
in the context of a specific combination of bands, depths and image
quality, i.e., a particular survey. For the purpose of exploring our
Bayesian approach to morphological classification we analyse data
from the multi-band UKIDSS imaging survey (Section 4.1), utilising
the overlap with the deeper SDSS Stripe 82 region (Section 4.2) to
provide a verification sample.

4.1 UKIDSS

UKIDSS (Lawrence et al. 2007) is a suite of five separate NIR
surveys using the Wide Field Camera (WFCAM; Casali et al. 2007) on
the United Kingdom Infrared Telescope (UKIRT). A detailed technical
description of the survey is given by Dye et al. (2006), although
there have been several improvements in the time since (Warren et
al. 2007). In particular, we analyse the UKIDSS Large Area Survey
(LAS), which includes imaging in the UKIDSS Y, J, H and K bands
(defined in Hewett et al. 2006) to average depths⁵ of
$Y \simeq 20.2$, $J \simeq 19.6$, $H \simeq 18.8$ and
$K \simeq 18.2$ (Dye et al. 2006; Warren et al. 2007). The UKIDSS
data are obtained from the WFCAM Science Archive⁶ (WSA; Hambly et
al. 2008), which supplies both images and processed catalogues of
detected sources.

Aside from basic image parameters (e.g., positions, counts, etc.)
these catalogues include a number of derived statistics, including
an extendedness statistic in each band. The statistic, as defined
in Irwin et al. (2010), is based on the fact that all the
unsaturated stars in each field have the same average curve of
growth (i.e., fraction of their total flux as a function of angular
radius). This average can be measured empirically, and a mismatch
statistic calculated for each source. In a given magnitude range
the statistic is scaled so that, for stars, it has zero mean, unit
variance and is approximately Gaussian distributed; this scaled
mismatch statistic is referred to as ClassStat in the WSA. Extended
galaxies (and blended pairs of sources) have positive ClassStat
values, whereas most noise sources (e.g., cosmic rays), being more
compact than the PSF, have negative ClassStat values. ClassStat
encodes much of the important morphological information in even
faint images, and is a superb morphology statistic. However because
it is a statistic based solely on the image data (i.e., it does not
include prior information about a source's nature) it cannot encode
all the information about a source (as distinct from the image of
it). Moreover, there is no well-motivated method of combining the
ClassStat values obtained from multiple measurements of a source.
(In UKIDSS combined source probabilities and ClassStat values are
reported, but these are heuristic in nature, and do not retain all
the information present in the band-specific ClassStat values.)


4.2 SDSS Stripe 82 

The SDSS (York et al. 2000) has surveyed $\sim 10^4$ deg$^2$ with
single observations in the u, g, r, i and z bands (Fukugita et al.
1996), to depths of $u \simeq 22.0$, $g \simeq 22.2$,
$r \simeq 22.2$, $i \simeq 21.3$ and $z \simeq 20.5$. The SDSS has
also taken repeat measurements in the Stripe 82 region (covering
the right ascension range $\alpha \leq 60$ deg and
$\alpha \geq 300$ deg and declinations of $|\delta| \leq 0.1$),
reaching depths of $u \simeq 23.6$, $g \simeq 24.5$,
$r \simeq 24.2$, $i \simeq 23.8$ and $z \simeq 22.1$.

The SDSS approach to star-galaxy classification is based on the use
of model magnitudes, each detected source being fit as both a point
source (i.e., the measured point-spread function) and a galaxy
(i.e., a Sersic 1963 profile with one of two different exponents).
The difference between the two magnitudes, termed the
concentration, $c$, is then used as a morphology statistic (Yasuda
et al. 2001). The basic classification is done by designating
sources with $c \leq 0.145$ as stars and sources with $c > 0.145$
as galaxies. Whilst this scheme is very effective, it is also
important to note that the classifications of up to a third of
sources are contradictory between bands (Yasuda et al. 2001).

5 Depths are given in terms of the magnitude of a point source
that would, on average, be detected with a signal-to-noise ratio
of 5.

6 The WSA is located at http://surveys.roe.ac.uk/wsa/.

The Stripe 82 data are significantly deeper than the UKIDSS LAS (in
the sense that all but the reddest sources are detected with a
greater signal-to-noise ratio in Stripe 82 than in the LAS, and an
average UKIDSS-selected source has $\sigma_r \simeq 0.1\,\sigma_Y$).
Even though the SDSS optical imaging has a significantly larger
seeing ($\simeq 1.2$ arcsec) than the UKIDSS NIR data ($\simeq 0.8$
arcsec), the SDSS Stripe 82 data for the morphologically ambiguous
sources near the LAS detection limit are able to separate point and
extended sources reliably. This is illustrated by Figs. 1 to 3.
Fig. 1 shows SDSS r-band concentration plotted against UKIDSS
Y-band ClassStat. For the faintest two magnitude bins
($Y \simeq 19$ and $Y \simeq 20$) it is impossible to identify two
different populations of sources along the horizontal (ClassStat)
axis, whereas this is still possible along the vertical
(concentration) axis. This is confirmed by the one-dimensional
histograms of both classification statistics [Figs. 2
(concentration) and 3 (ClassStat)]. For $Y \simeq 19$, in Fig. 3
the two populations of sources have almost completely merged, even
though the histogram is still bi-modal, and for $Y \simeq 20$ the
two populations of sources cannot be distinguished at all. However
the corresponding histograms for SDSS concentration clearly show
two distinct populations of sources⁷. In particular, for
$Y \simeq 20$, the SDSS r-band class labels misclassify only
$\simeq 4\%$ of sources (this number is obtained by fitting a
Gaussian distribution to the star population and a log-normal to
the galaxy population for the SDSS concentration data). This is a
very good result when compared to the UKIDSS ClassStat data which,
in this faint regime, no longer allow a separation into two
populations of sources (Fig. 3).

Hence, for the purpose of star-galaxy separation, we 
treat the SDSS Stripe 82 data as definitive classifications 
against which our Bayesian LAS classifications can be 
tested. 

4.3 Test sample 

Our starting point is a sample of 121 902 UKIDSS sources in a
14.4 deg$^2$ area defined by right ascensions of either
$\alpha \leq 60$ deg or $\alpha \geq 300$ deg and declinations of
$|\delta| \leq 0.1$. This area is entirely within the SDSS Stripe 82
region, and has been covered by UKIDSS in the Y, J, H and K bands.
Our main aim is to classify these sources and compare the 
results to the SDSS Stripe 82 classifications. But to do so 
requires the preliminary task of generating the magnitude- 
dependent prior distributions of ClassStat, along with the 
star and galaxy number counts. This is not part of the actual 
classification process (i.e., it is independent of any single 
source), and so is considered separately from the results. 

4.4 Number counts 

The number counts of stars and galaxies provide the prior that will
be used to classify sources for which the image data are ambiguous.
The counts could be obtained from deeper surveys (although none
exist in all the UKIDSS LAS bands) or from physical models of the
source populations (although this would be unnecessarily
complicated). For the restricted problem of star-galaxy separation,
however, it is more direct to fit the star and galaxy counts to the
target sample itself. At the bright end the numbers are given
directly by the data; at the faint end it is also necessary to
assume some weak prior information (essentially that a smooth
extrapolation from the bright counts is reasonable).

7 From Fig. 14 it is clear that, for $r > 20.5$, the two clearly
distinct populations of stars and galaxies merge. By limiting
ourselves to sources with $16 \leq r \leq 20.5$ (thus also avoiding
saturated sources) we assume the SDSS class labels to be correct.

Figure 4. Differential number counts of all sources (black), stars
(blue) and galaxies (red) from UKIDSS observations, plotted against
Y magnitude. Classifications are obtained by using our model with
number counts obtained by binning the data into equal-sized
magnitude bins and fitting simple mixture models to the
$\hat{c}_Y$ data in each bin. Also shown as dashed lines are the
model fits (see Eq. 11), both with and without a correction for
incompleteness.

For the UKIDSS LAS we have chosen the Y band as the reference
band⁸. The observed Y band counts of stars and galaxies (identified
here by using our model with number counts obtained by binning the
data by magnitude and interpolating the parameters) from the test
sample described in Section 4.3 are shown in Fig. 4. Both exhibit
exponential counts down to $Y \simeq 19$, beyond which the survey
incompleteness dominates (as expected, given the average UKIDSS LAS
limit of $Y \simeq 20.2$). For both stars and galaxies the intrinsic
number counts are taken to be of the form

$$
P_t(Y) = \frac{\mathrm{d}N_t}{\mathrm{d}Y}
= \alpha_t \ln(10)\, N_t\, 10^{\alpha_t (Y - Y_0)}, \qquad (11)
$$

where $N_t$ is the number of sources (optionally per unit solid
angle, although this detail is unimportant as long as the same
normalising convention is used for stars and galaxies) of type $t$
brighter than the reference magnitude $Y_0$, and $\alpha_t$ is the
type-dependent logarithmic slope. Even though $Y_0$ and $N_t$ are
degenerate it is convenient to set $Y_0$ to the Y-band magnitude
limit, in which case $N_t$ is approximately equal to the number of
objects of type $t$ in the sample.

8 As some sources have not been observed in all of the bands, for
$\hat{m}$ we chose the average of the magnitudes $\hat{m}_b$ in the
bands in which a given source has been observed. To convert all of
these magnitudes onto the scale of the reference band we have added
the average colours $Y-J$, $Y-H$, $Y-K$ to the magnitudes
$\hat{m}_b$.

Figure 1. SDSS r-band concentration plotted against UKIDSS Y-band ClassStat for different magnitude bins.



In order to fit these parameters, however, it is necessary to
account for the incompleteness in each band, denoted here as
$\Pr(\mathrm{det}|Y)$, which was introduced in Eq. (7). The
magnitude limit $m_{\mathrm{lim},b}$ and incompleteness range
$\Delta m_b$ are fit in the Y, J, H and K bands for both stars and
galaxies. Fitting
$(\mathrm{d}N_t/\mathrm{d}Y)\,\Pr(\mathrm{det}|Y)$ to the observed
UKIDSS counts yields the fits shown in Fig. 4. Although there are
some discrepancies, the key point is that the relative numbers of
stars and galaxies at a given magnitude will give far more accurate
prior probabilities than, say, an uninformative prior [i.e.,
Pr(s) = Pr(g) = 0.5 for all sources].
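The way the number counts act as a magnitude-dependent prior can be sketched in a few lines. The example below is a minimal illustration only: it evaluates the power-law counts of Eq. (11) and the resulting prior star probability, ignores the incompleteness term, and uses made-up values for $N_t$ and $\alpha_t$ (the function and constant names are likewise hypothetical, not taken from the paper).

```python
import math

def dn_dy(y, n_t, alpha_t, y0):
    """Differential counts of Eq. (11) for one source type."""
    return alpha_t * math.log(10.0) * n_t * 10.0 ** (alpha_t * (y - y0))

def counts_brighter_than(y, n_t, alpha_t, y0):
    """Analytic integral of dn_dy from -infinity to y; equals n_t at y = y0."""
    return n_t * 10.0 ** (alpha_t * (y - y0))

def prior_star_probability(y, star_params, galaxy_params, y0):
    """Pr(s | Y) from the relative star and galaxy counts at magnitude y."""
    stars = dn_dy(y, *star_params, y0)
    galaxies = dn_dy(y, *galaxy_params, y0)
    return stars / (stars + galaxies)

# Hypothetical (N_t, alpha_t) pairs, chosen only for illustration.
STAR_PARAMS = (60000.0, 0.2)
GALAXY_PARAMS = (60000.0, 0.5)
Y0 = 20.2  # reference magnitude, set here to the quoted LAS Y-band depth

for y in (16.0, 18.0, 20.0):
    p = prior_star_probability(y, STAR_PARAMS, GALAXY_PARAMS, Y0)
    print(f"Y = {y:4.1f}  Pr(s|Y) = {p:.3f}")
```

With a steeper galaxy slope, as assumed here, the prior star probability falls towards faint magnitudes, which is the behaviour an uninformative prior of 0.5 cannot capture.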



4.5 ClassStat distributions 

ClassStat is constructed so that, on average, $c = 0$ for stars and
$c > 0$ for extended sources. We observe $\hat{c}$, however, the
distribution of which, for isolated stars, should be normal (with
zero mean and unit variance), again by construction. However the
observed ClassStat distribution of bright stars (defined as UKIDSS
sources with $13 < Y < 17$ and $|\hat{c}_Y| < 6$) shown in Fig. 5
appears to be significantly non-Gaussian. This impression is
confirmed by the Shapiro & Wilk (1965) and one-sample
Kolmogorov-Smirnov (Conover 1999) normality tests.

Figure 2. One-dimensional slices of SDSS concentration data for different magnitude bins.

The distribution of ClassStat values for the bright stars has a
slightly negative mean, and is weakly positively skewed. Due to the
positive skewness, using a symmetrical distribution with larger
tails than a normal (such as Student's t-distribution) will not
result in a good fit. For the observed ClassStat distribution we
have instead adopted a Gaussian mixture model of the form

$$
\Pr(\hat{c}_b|c) = \alpha\,\mathcal{N}(\hat{c}_b - c;\, \mu_1, 1)
+ (1 - \alpha)\,\mathcal{N}(\hat{c}_b - c;\, \mu_2, \sigma_2), \qquad (12)
$$

where, for stars, $c = 0$, and $\mu_1$, $\mu_2$ and $\sigma_2$ are
free parameters to be fit. These were fit using a simple maximum
likelihood (ML) approach in each of the four UKIDSS bands. The
resulting values are given in Table 1 and the Y band fit is
compared to the data in Fig. 5.
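As a concrete illustration of the mixture model in Eq. (12), the sketch below evaluates its density and the log-likelihood that an ML fit would maximise. It is a minimal example under two stated assumptions: $\sigma_2$ is treated as a standard deviation, and the Y-band parameter values are copied as printed in Table 1. The function names are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def classstat_noise_pdf(c_hat, c_true, alpha, mu1, mu2, sigma2):
    """Two-component mixture of Eq. (12); sigma2 is assumed to be a standard deviation."""
    r = c_hat - c_true
    return alpha * normal_pdf(r, mu1, 1.0) + (1.0 - alpha) * normal_pdf(r, mu2, sigma2)

def log_likelihood(c_hat_values, alpha, mu1, mu2, sigma2):
    """Log-likelihood of measured stellar ClassStat values (true c = 0)."""
    return sum(
        math.log(classstat_noise_pdf(c_hat, 0.0, alpha, mu1, mu2, sigma2))
        for c_hat in c_hat_values
    )

# Y-band values as printed in Table 1 (assumed to be transcribed correctly).
Y_BAND = dict(alpha=0.9453, mu1=0.1418, mu2=2.3950, sigma2=3.2021)

print(classstat_noise_pdf(0.0, 0.0, **Y_BAND))
print(log_likelihood([-0.5, 0.2, 1.1, 4.0], **Y_BAND))
```

An ML fit would pass `log_likelihood` (negated) to a numerical optimiser over the four parameters; that step is omitted here.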

We used the Bayesian information criterion (BIC; Schwarz 1978) to
assess the model fit. As expected, the Gaussian mixture model is a
considerably better fit to the data than either the fiducial
unit-variance Gaussian, or the Gaussian with ML parameters,
resulting in significantly lower BIC values.
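The BIC comparison can be illustrated on synthetic data. The sketch below uses the standard definition $\mathrm{BIC} = k \ln n - 2 \ln L$ and a sample drawn from a hypothetical mixture whose parameters are loosely based on Table 1; it is not the paper's data or fitting code. For brevity the mixture is evaluated at its generating parameters rather than at an ML fit, so its BIC is an upper bound on what a fitted mixture would achieve.

```python
import math
import random

def bic(log_like, n_params, n_data):
    """Schwarz criterion: k ln(n) - 2 ln(L); lower values are preferred."""
    return n_params * math.log(n_data) - 2.0 * log_like

def normal_logpdf(x, mean, sd):
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))

def mixture_logpdf(x, alpha, mu1, mu2, sigma2):
    """Log-density of the two-component mixture, computed stably."""
    a = math.log(alpha) + normal_logpdf(x, mu1, 1.0)
    b = math.log(1.0 - alpha) + normal_logpdf(x, mu2, sigma2)
    top = max(a, b)
    return top + math.log(math.exp(a - top) + math.exp(b - top))

random.seed(1)
TRUE = (0.95, 0.1, 2.4, 3.2)  # hypothetical alpha, mu1, mu2, sigma2
sample = [
    random.gauss(TRUE[1], 1.0) if random.random() < TRUE[0]
    else random.gauss(TRUE[2], TRUE[3])
    for _ in range(5000)
]
n = len(sample)
mean = sum(sample) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)  # ML estimates

bic_unit = bic(sum(normal_logpdf(x, 0.0, 1.0) for x in sample), 0, n)
bic_gauss = bic(sum(normal_logpdf(x, mean, sd) for x in sample), 2, n)
bic_mix = bic(sum(mixture_logpdf(x, *TRUE) for x in sample), 4, n)

print(bic_unit, bic_gauss, bic_mix)
```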

The distribution of $c$ is more complicated for galaxies than for
stars, both because galaxies are intrinsically more varied, and
also because the definition of the morphology statistic is
essentially independent of galaxies' properties. For the UKIDSS
sample an empirical function was sought which could represent the
distribution of galaxies' $c$ values as a function of magnitude.
Particular care was taken to ensure a good fit close to the
survey's limit, for which there is minimal morphological
information and $c \rightarrow 0$, even for galaxies.

These desiderata are met by a log-normal distribution:

$$
\Pr(c|m, t = g) = \mathcal{L}[c;\, \mu(m), \sigma(m)], \qquad (13)
$$

where

$$
\mathcal{L}(x;\, \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}}
\exp\left\{-\frac{[\ln(x) - \mu]^2}{2\sigma^2}\right\}. \qquad (14)
$$

Figure 3. One-dimensional slices of UKIDSS ClassStat data for different magnitude bins. Also shown is the fit of our model [overall
probability density (black line), star class probability density (blue) and galaxy class probability density (red)], which is discussed in
Section 4.6 and further illustrated in Fig. 6.



Rather than specifying the functions $\mu(m)$ and $\sigma(m)$ of
the standard parameterisation of the log-normal distribution
(Eq. 14), we have modelled the mean $\mu'(m)$ and standard
deviation $\sigma'(m)$ of the log-normal distribution⁹ by the
empirical functions below,

$$
\mu'(m) = \left(1 - \frac{m}{m_{\max}}\right)
\times \left\{[\nu_1 (m - \nu_4)^2 + \nu_2 (m - \nu_4) + \nu_3]^{\nu_5} + \nu_6\right\}, \qquad (15)
$$

$$
\sigma'(m) = \eta_1\, 10^{\eta_2 (m - 11)} + \ldots, \qquad (16)
$$


9 Here, $\mu'$ and $\sigma'$ are the mean and standard deviation of
a random variable the logarithm of which is normally distributed
with mean $\mu$ and standard deviation $\sigma$. These parameters
are related via a standard distributional result:
$\mu' = e^{\mu + \sigma^2/2}$ and
$\sigma'^2 = (e^{\sigma^2} - 1)\,e^{2\mu + \sigma^2}$.
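Since the model is specified through the mean and standard deviation of the log-normal rather than its standard parameters, evaluating Eq. (14) requires inverting the relations in footnote 9. The following sketch shows that inversion, $\sigma^2 = \ln(1 + \sigma'^2/\mu'^2)$ and $\mu = \ln\mu' - \sigma^2/2$, together with the density itself; the function names are hypothetical.

```python
import math

def lognormal_moments(mu, sigma):
    """Mean and standard deviation (mu', sigma') implied by (mu, sigma)."""
    mean = math.exp(mu + 0.5 * sigma ** 2)
    var = (math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)
    return mean, math.sqrt(var)

def lognormal_params_from_moments(mean, sd):
    """Invert the relations above: (mu', sigma') -> (mu, sigma)."""
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma_sq
    return mu, math.sqrt(sigma_sq)

def lognormal_pdf(x, mu, sigma):
    """Standard log-normal density of Eq. (14); zero for x <= 0."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

# Example with made-up moments: mean ClassStat 8, standard deviation 4.
mu, sigma = lognormal_params_from_moments(8.0, 4.0)
print(mu, sigma, lognormal_moments(mu, sigma))
```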
Figure 5. The empirical distribution of ClassStat values of
bright ($13 < Y < 17$) UKIDSS sources with $|\hat{c}_Y| \leq 6$
(this selection region is shown in Fig. 9). Also shown is a
$\mathcal{N}(0, 1)$ normal distribution and the best-fit Gaussian
mixture model defined in Eq. (12).



Figure 6. The distribution of UKIDSS sources (black points) and
the model (contours) in the H band. The case for the H band is
plotted as the saturation of bright sources is not as apparent in
the Y band. One-dimensional plots of the model fit (this time for
the Y band) are shown in Fig. 3.



where $m_{\max}$ is the upper detection limit in the reference
band and $\nu_1$, $\nu_2$, $\nu_3$, $\nu_4$, $\nu_5$, $\nu_6$,
$\eta_1$ and $\eta_2$ are free parameters fitted by a simple
least-squares (LS) procedure.

The stellar and galactic densities implied by our models are shown
as contours in Fig. 6, along with the sample from which the fit was
derived. (The H band, rather
than the Y band, was chosen as it has the highest num- 
ber of saturated sources, thus emphasizing an aspect of the 
data that is not included in the model.) The fit is not per- 
fect (e.g., the true density is underestimated at the bright 
end and slightly overestimated in two regions near the faint 
end), but is very good. Also, the bright UKIDSS stars (with 
H < 12.5) have significantly positive ClassStat values, as 
they are saturated; we do not attempt to include this phe- 
nomenon as essentially all sources bright enough to be sat- 
urated in UKIDSS images can be classified as stars on the 
basis of prior information. 

4.6 Simulated data 

Given that the distribution of magnitudes and morphol- 
ogy statistics described above was developed sequentially, 
it is important to perform an end-to-end test of the entire 
model. 

The first stage of this was to generate a sample of sim- 
ulated sources from the model. The algorithm for doing so 
can be broken down into several steps: 

• Draw a true Y band magnitude from the total (star +
galaxy) number count model given in Eq. (11).

• Determine the type (star or galaxy) of the object from
the relative number counts at this Y band magnitude.

• Use the average $Y-J$, $Y-H$ and $Y-K$ colours for stars
and galaxies (as shown in Fig. 10) to obtain J, H and K
band magnitudes.

• Record the object as being detected in each band with
probability given by the incompleteness formula in Eq. (7).

• Add observational (sky) noise to the true magnitudes 
in all bands by sampling from a Gaussian distribution with 
zero mean and band-dependent standard deviation given by 
Eq. ©. 

• Generate ClassStat values for each band by sampling
$c$ from Eq. (13) for galaxies, setting $c = 0$ for stars and then
sampling from the mixture model given in Eq. (12).

Fig. 8 shows a sample of data generated by the above
procedure. Having verified that generating sources from our
model can accurately mimic the relevant UKIDSS data, the
model can now be used with confidence as the prior needed
to perform Bayesian star-galaxy classification.
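The generation steps listed above can be sketched as follows. This is an illustrative outline only, not the authors' code: the incompleteness and photometric noise formulae are simple placeholders (the paper's own expressions are not reproduced in this section), and the slopes, normalisations, colours and galaxy ClassStat trend are all made-up values. Drawing the type first and then the magnitude from that type's counts is used here because it is statistically equivalent to drawing Y from the total counts and then assigning the type from the relative counts.

```python
import math
import random

BANDS = ("Y", "J", "H", "K")
Y_MIN, Y_MAX = 13.0, 21.0                        # simulated Y range (assumed)
SLOPES = {"star": 0.2, "galaxy": 0.5}            # alpha_t (hypothetical)
NORMS = {"star": 1.0, "galaxy": 1.0}             # N_t at Y0 = Y_MAX (hypothetical)
COLOURS = {                                      # average Y - band colours (hypothetical)
    "star": {"Y": 0.0, "J": 0.4, "H": 0.8, "K": 1.0},
    "galaxy": {"Y": 0.0, "J": 0.6, "H": 1.4, "K": 2.0},
}
LIMITS = {"Y": 20.2, "J": 19.6, "H": 18.8, "K": 18.2}  # depths quoted in Section 4.1
MIXTURE = (0.95, 0.1, 2.4, 3.2)                  # alpha, mu1, mu2, sigma2 (rounded, illustrative)

def draw_type_and_magnitude(rng):
    """Steps 1-2: draw the type from integrated counts, then Y by inverse CDF."""
    weights = {t: NORMS[t] * (1.0 - 10.0 ** (SLOPES[t] * (Y_MIN - Y_MAX))) for t in SLOPES}
    kind = "star" if rng.random() * sum(weights.values()) < weights["star"] else "galaxy"
    low = 10.0 ** (SLOPES[kind] * (Y_MIN - Y_MAX))
    y = Y_MAX + math.log10(low + rng.random() * (1.0 - low)) / SLOPES[kind]
    return kind, y

def detection_probability(m, m_lim, width=0.3):
    """Placeholder incompleteness curve (not the paper's Eq. 7)."""
    return 1.0 / (1.0 + math.exp((m - m_lim) / width))

def magnitude_noise_sd(m, m_lim):
    """Placeholder noise model: about 0.2 mag at the 5-sigma depth."""
    return 0.2 * 10.0 ** (0.4 * (m - m_lim))

def true_classstat(kind, m, m_lim, rng):
    """c = 0 for stars; log-normal for galaxies with a made-up mean trend."""
    if kind == "star":
        return 0.0
    mean = max(0.2, 4.0 * (m_lim - m))           # hypothetical mu'(m)
    sd = 0.5 * mean                              # hypothetical sigma'(m)
    sigma_sq = math.log(1.0 + (sd / mean) ** 2)
    return rng.lognormvariate(math.log(mean) - 0.5 * sigma_sq, math.sqrt(sigma_sq))

def observed_classstat(c_true, rng):
    """Add mixture-model noise in the spirit of Eq. (12)."""
    alpha, mu1, mu2, sigma2 = MIXTURE
    if rng.random() < alpha:
        return c_true + rng.gauss(mu1, 1.0)
    return c_true + rng.gauss(mu2, sigma2)

def simulate_source(rng):
    kind, y_true = draw_type_and_magnitude(rng)
    source = {"type": kind, "y_true": y_true, "bands": {}}
    for band in BANDS:
        m_true = y_true - COLOURS[kind][band]    # step 3: apply average colour
        if rng.random() >= detection_probability(m_true, LIMITS[band]):
            continue                             # step 4: not detected in this band
        m_hat = rng.gauss(m_true, magnitude_noise_sd(m_true, LIMITS[band]))  # step 5
        c_true = true_classstat(kind, m_true, LIMITS[band], rng)             # step 6
        source["bands"][band] = (m_hat, observed_classstat(c_true, rng))
    return source

rng = random.Random(42)
catalogue = [simulate_source(rng) for _ in range(1000)]
print(len(catalogue), sum(s["type"] == "star" for s in catalogue))
```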



5 ANALYSIS OF SIMULATED UKIDSS DATA 

A first test of our Bayesian star-galaxy classification method is
to analyse the simulated UKIDSS data described in Section 4.6. As
the input star and galaxy distributions are known, the resultant
stellar probabilities are, given the deliberately imposed
restrictions on the use of colour information, optimal. In
particular, the numbers and properties of the sources which cannot
be classified decisively are of interest, as any real sources with
such properties will have $P_s \simeq 0.5$.

The distribution of posterior star probabilities for all sources is
shown in Fig. 7 and the distribution in $Y$ vs. $\hat{c}_Y$ space
is shown in Fig. 8. These results from simulated data can be
compared to Figs. 11 (left) and 12 (left), which show the results
when our method is applied to real UKIDSS data. While there is not
much difference between Figs. 8 and 12 (left), there are two
noticeable differences between Figs. 7
Table 1. Maximum likelihood values, with corresponding standard errors in brackets,
for the parameters of the Gaussian mixture model used for the observational noise.

band  $\alpha$         $\mu_1$          $\mu_2$          $\sigma_2$
Y     0.9453 (0.0030)  0.1418 (0.0085)  2.3950 (0.1501)  3.2021 (0.08600)
J     0.9436 (0.0053)  0.1131 (0.0143)  1.1879 (0.2379)  3.7199 (0.1776)
H     0.9601 (0.0033)  0.1266 (0.0117)  3.3037 (0.2922)  3.7523 (0.1617)
K     0.9474 (0.0039)  0.0360 (0.0118)  3.6881 (0.2463)  3.4449 (0.12975)



Figure 7. Histogram of the posterior star probabilities, $P_s$,
evaluated for simulated UKIDSS data.



Table 2. Fraction of sources with posterior probabilities between
0.4 and 0.6 for both the single-band models and the joint model.
The joint-model probabilities are the same in every band, but the
fractions differ because, for each band, only the sources observed
in that band are counted.

band               Y       J       H       K
single-band model  0.0332  0.0384  0.0322  0.0336
joint model        0.0254  0.0254  0.0201  0.0155



and 11 (left): there are more simulated sources with low star
probabilities and there are more sources with $P_s$ clearly
different from 0 and 1 (i.e., not classified with certainty). In
particular there are many more sources with $P_s < 0.4$, yet
clearly non-zero. The former difference can be explained by the
fact that there are fewer bright sources (which are predominantly
stars and hence have high star probabilities) among the generated
data. This means that for equal sample sizes there will be more
sources with low star probabilities in the simulated sample when
compared to the original data sample. The increase in sources with
less definite classifications is due to the fact that, as
acknowledged in Section 3.1, our model is not designed to take
inter-band photometric noise correlations into account. Thus the
simulated data sample contains more sources with seemingly
contradictory ClassStat data in the different bands than a sample
of real data. Both of these differences have only a small effect on
the simulated data, and should affect the classification of a
negligible number of real sources.



6 RESULTS 

The Bayesian method of star-galaxy classification described above
was applied to the sample of UKIDSS sources in the SDSS Stripe 82
region, giving single-band star probabilities for every source
detected (in each band in which the source was detected), as well
as combined probabilities. The general properties of the classifier
are discussed in Section 6.1 and then compared to the UKIDSS
classifications (in Section 6.2) and the SDSS classifications (in
Section 6.3).



Figure 8. Combined star probabilities derived from our Bayesian 
method for simulated UKIDSS data. 



6.1 Properties of the classifier 

Figure 9 shows the single-band posterior star probabilities in
$Y$-$\hat{c}_Y$ space. These can be compared with the probabilities
obtained by using the full multi-band model (Fig. 12). The most
notable difference is that for the latter case there seem to be
fewer sources which confound the classifier, i.e., with
$P_s \simeq 0.5$. Table 2 lists the fraction of sources for which
the classifier gives $0.4 \leq P_s \leq 0.6$. Compared to the
single-band model, there is a decrease of at least 25 per cent in
this number for the combined model. While a reduction in the
classifier-confounding region is not always desirable, here this
decrease reflects the fact that the classifier will be at a loss
only when the data from different bands are contradictory, or when
a source's type is unclear in all the bands in which it was
detected.

Figure 9. Single-band star class probabilities (Y band). The dotted
box represents the selection region for the sources from Fig. 5.

Figure 10. Colour-colour plot of the posterior star class
probability vector.

Figure 10 shows the distribution of the posterior star class
probabilities over $Y-H$ vs. $H-K$ space. Even though the model has
not been designed to optimise class separation in colour-colour
space, there are two clearly distinct populations. Furthermore,
sources with low star probabilities have $Y-H \simeq 1.5$ and
$H-K \simeq 0.8$, as expected.

6.2 Comparison with UKIDSS pipeline 
classifications 

Figure 12 (left) shows the posterior stellar probabilities in the
$\hat{c}_Y$ vs. $Y$ plane (the choice of band is unimportant, as
the J, H and K band plots are similar). It is clear that for the
overwhelming majority of objects, in particular those with either
$Y < 18$ or $\hat{c}_Y > 5$, the Bayesian classifier gives very
definite classifications (i.e., values close to either 0 or 1).
Unsurprisingly, the region where the classifier is most often
confounded is where the star and galaxy loci merge. Indeed, as the
two loci overlap completely at the faint end, there is very little
information regarding object class to be extracted from the
measured ClassStat values, and the prior knowledge drives the
classification.

One of the main aims of our classifier is to make the fullest
possible use of whatever morphology statistic is available (the
UKIDSS ClassStat statistic in the case considered here), in
particular for sources where it has been measured in multiple
bands. Several heuristic methods are used to combine multiple
measurements in the WSA, including simple averaging and a plausible,
but again heuristic, contingency table for sources where the
ClassStat measurements in different bands imply contradictory
classifications. Our Bayesian method has the potential to propagate
all the information contained in the individual $\hat{c}$ values
correctly, albeit at the cost of introducing an explicit, and
complicated, model.

The UKIDSS pipeline posterior star probabilities can be compared
to those from our model (Fig. 12). Both classifiers yield similar
posterior star probabilities for sources
which are fairly bright and/or have large ClassStat values, 
but deal differently with faint sources with small ClassStat 
values. Apart from a slight shift to the left at the faint end, 
the UKIDSS pipeline classifier can be seen to consist essen- 
tially of a vertical cut on the ClassStat value. The classifier- 
confounding region (i.e., where the classifier outputs prob- 
abilities near 0.5) is fairly small, and, crucially, does not 
widen at the faint end. Our classifier, however, through the 
input of prior knowledge, is not limited to taking a verti- 
cal cut and the classifier-confounding region is larger, par- 
ticularly at the faint end. Indeed, near the detection limit, 
the ClassStat values carry almost no information concern- 
ing object type, as stars and galaxies have similar values 
at those fluxes. It thus makes very little sense to base a 
classification on that information. Using prior knowledge is 
vital for such faint sources. Our classifier allows a continu- 
ous transition from ClassStat value based classification to 
prior knowledge based classification. The resulting broader 
classifier-confounding region is not a drawback: if an object 
has P s ~ 0.5, it means that, given the observed data, it is 
impossible to tell whether that source is a star or a galaxy. 
Artificially coercing posterior classifications to be unambigu- 
ous is wrong. If a source cannot be reliably classified, then 
its posterior probability should reflect this. 
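The continuous transition from ClassStat-driven to prior-driven classification can be illustrated with a single-band toy calculation. The sketch below is a simplified stand-in for the full model, under stated assumptions: unit-variance Gaussian measurement noise replaces the mixture of Eq. (12), the galaxy ClassStat distribution is a log-normal with made-up parameters, and the posterior is the usual two-class Bayes ratio. All names and numbers are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def galaxy_likelihood(c_hat, mu, sigma, steps=2000):
    """Integral of Pr(c | galaxy) Pr(c_hat | c) dc, evaluated in u = ln(c).

    In log space the log-normal prior on c becomes a normal density in u,
    so a midpoint rule over mu +/- 8 sigma is accurate.
    """
    lo = mu - 8.0 * sigma
    du = 16.0 * sigma / steps
    total = 0.0
    for i in range(steps):
        u = lo + (i + 0.5) * du
        total += normal_pdf(u, mu, sigma) * normal_pdf(c_hat, math.exp(u), 1.0)
    return total * du

def star_probability(c_hat, prior_star, mu, sigma):
    """Two-class posterior: prior times likelihood, normalised."""
    star_term = prior_star * normal_pdf(c_hat, 0.0, 1.0)
    galaxy_term = (1.0 - prior_star) * galaxy_likelihood(c_hat, mu, sigma)
    return star_term / (star_term + galaxy_term)

# Bright regime: galaxies have large true ClassStat (median 20, made up),
# so the measurement decides the class regardless of the prior.
print(star_probability(15.0, 0.9, math.log(20.0), 0.5))
print(star_probability(0.0, 0.5, math.log(20.0), 0.5))

# Faint regime: galaxy ClassStat collapses towards 0 (median 0.05, made up),
# so both likelihoods nearly coincide and the posterior stays near the prior.
print(star_probability(0.3, 0.4, math.log(0.05), 0.5))
```

In the faint case the output stays close to the prior of 0.4, which is the behaviour the text argues a well-calibrated classifier should have.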

Both the posterior probabilities computed by our classifier and
the original, observed ClassStat values can serve as indicators of
source type. While one should take a source's flux into account
when assessing its ClassStat data (cf. Fig. 12), ClassStat is
designed so as to differentiate between resolved and unresolved
sources, and is indeed used for this purpose by the UKIDSS
pipeline. Hence it makes sense to compare the posterior class
probabilities directly with the ClassStat values.

Figure 11. Histograms of the posterior star class probability vectors for both our model and the UKIDSS pipeline.

Figure 12. Combined star probabilities derived from our Bayesian method (left) and the UKIDSS pipeline (right) as a function of the
measured Y band ClassStat and magnitude.

Figure 13 summarises the situation for different magnitude regimes.
At fairly bright magnitudes (i.e., $Y \simeq 17$) most sources have
$P_s \simeq 1$, except for obviously extended sources with very
large ClassStat values. At the faint end ($Y \simeq 20$) the
classifications are not so decisive, with few sources having
$P_s \simeq 0$ or $P_s \simeq 1$. The depth of the UKIDSS LAS is
such that the surface density of stars and galaxies is comparable
at the survey's magnitude limit. This is the most interesting
regime for star-galaxy classification problems: significantly
shallower or deeper surveys would be dominated by stars or
galaxies, respectively, at their magnitude limit, and so
essentially all the poorly measured sources would be decisively
classified purely by the population prior.

However, very low star probabilities ($P_s < 0.1$) are only reached
when ClassStat exceeds a certain threshold. In the region where
star and galaxy populations merge (in magnitude vs. ClassStat
space; $Y \simeq 19$) a trend is apparent: large ClassStat values
result in low posterior star probabilities. However the reverse is
not true: except for sources with extremely low ($\hat{c}_Y < 0$)
or high ($\hat{c}_Y > 10$) ClassStat values, a source's star
probability does not reveal much about its ClassStat value. In the
regions where stars and galaxies are fairly well separated
($Y \simeq 17$ and $Y \simeq 18$), there is a good correspondence
between posterior star probability and ClassStat.

Figure 13. Posterior star probabilities plotted against MergedClassStat for different magnitude bins.



6.3 Comparison with SDSS Stripe 82 
classifications 

Figure 14 shows the posterior star probabilities from our model as
a function of SDSS concentration and r-band magnitude. The dotted
line indicates the threshold concentration value (0.145) for SDSS
star/galaxy labels. Overall there is good agreement, with most
sources with high $P_s$ lying to the left of the line (where SDSS
labels sources as stars) and sources with low $P_s$ lying to the
right.

For sources classified with great confidence by both classifiers
[i.e., fairly bright, but non-saturated sources ($16 < r < 21.5$)
with corresponding UKIDSS posterior star probabilities above 0.9
and SDSS concentration below 0.05 or posterior star probabilities
below 0.1 and concentration above 0.2], we can study those sources
for which the two classifiers disagree. Figure 15 shows that most
such sources lie right between the star and galaxy loci.
Figure 14. UKIDSS posterior star probabilities shown as a function
of the measured SDSS Stripe 82 concentration vs. r-band magnitude.
Sources to the left/right of the dotted line (with concentration =
0.145) are classified as stars/galaxies in SDSS.



Figure 16. Y and J band ellipticities of sources for which the two
classifiers disagree; the dotted line is the main diagonal.




[Figure 15: horizontal axis, merged ClassStat; legend entries include "not obv. extended" and "saturated".]

Figure 15. The full sample of UKIDSS sources (grey points) with
inconsistently classified sources highlighted (red). These are
sources with 16 < average SDSS magnitude < 21.5 which have either
$P_s \geq 0.9$ and $c_{\mathrm{SDSS}} > 0.2$ or $P_s \leq 0.1$ and
$c_{\mathrm{SDSS}} < 0.05$. Most are faint enough that some chance
of incorrect classification is expected on statistical grounds; an
explanation for the brighter sources was sought via visual
inspection, the results of which are indicated.



We have visually checked the sources for which the classifiers
disagree. Most are either blended pairs of stars (usually in
UKIDSS) or affected by diffraction spikes (in either survey). These
sources have been included in Fig. 15 and their type is indicated.
Sources with large (> 15) ClassStat values are all either blended
binary stars or affected by diffraction spikes of a nearby bright
star.

Figure 16 shows the ellipticities of the misclassified
sources, as measured in UKIDSS and SDSS. In most cases 
the two measurements are consistent, but for five sources 
the estimated ellipticities disagree strongly. Most of the bi- 
nary stars undetected by UKIDSS, and sources affected by 
diffraction spikes, lie in the upper-right quadrant of the plot, 
indicating that UKIDSS indeed detected them as single, ex- 
tended objects. The five sources far off the diagonal have 
contradictory data in the different bands. Whether due to 
noise or inherent source properties, such data will confuse 
any classifier. 

Figure 17. Mismatch rates between the SDSS r-band class labels 
and labels based on our classifier (red) and the UKIDSS pipeline 
(blue). Mismatch rates are shown both for all sources (with 16 < 
r < 20.5; solid lines) and for those sources for which our classifier 
outputs very definite classifications (P_s < 0.1 or P_s > 0.9; dashed 
lines). The magnitude values on the horizontal axis are the 
mid-range values of the bins used to compute the rates. Also shown 
are the standard errors of the mismatch rates. 

Comparing our classifier and the UKIDSS pipeline to 
the Stripe 82 data, Figure 17 shows the mismatch rates 
of both classifiers, taking the SDSS r-band classifications 
as a reference. To do this we have converted the posterior 
probabilities into class labels: an object is labelled as 
a star if P_s ≥ 0.5, and otherwise as a galaxy. We have limited 
the sources to those with 16 < r < 20.5 so as to avoid 
saturated sources (r < 16) and sources for which the 
uncertainty of the SDSS labels is non-negligible (r > 20.5). It 
is clear that the Bayesian classifier is more accurate than 
the UKIDSS pipeline classifier, although the difference 
in performance decreases at fainter magnitudes. For sources 
with 16.6 ≤ Y < 17.4, our classifier achieves a mismatch rate 
of 0.0154, compared to 0.0314 for the UKIDSS pipeline. At 
the faint end (Y > 20), the mismatch rates are 0.0679 (our 
classifier) and 0.0751 (UKIDSS pipeline). For all sources 
with 16 < r < 20.5, the mismatch rate for the UKIDSS 
pipeline (0.0440) is more than double that of our classifier 
(0.0218). 
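
The conversion of posterior probabilities into class labels and the resulting mismatch rates can be sketched as follows. This is a minimal illustration, not the code used to produce Figure 17: the function names and the toy inputs are hypothetical, and the binomial form of the standard error is an assumption (the text states only that standard errors are shown).

```python
import math


def mismatch_rate(p_star, ref_is_star, threshold=0.5):
    """Mismatch rate against reference labels, with a standard error.

    A source is labelled a star if its posterior star probability is
    >= threshold, and a galaxy otherwise.
    """
    if len(p_star) != len(ref_is_star):
        raise ValueError("inputs must have the same length")
    n = len(p_star)
    if n == 0:
        raise ValueError("need at least one source")
    mismatches = sum((p >= threshold) != bool(ref)
                     for p, ref in zip(p_star, ref_is_star))
    rate = mismatches / n
    # Assumed error model: binomial standard error of a proportion.
    std_err = math.sqrt(rate * (1.0 - rate) / n)
    return rate, std_err


def binned_mismatch_rates(mags, p_star, ref_is_star, edges, threshold=0.5):
    """Mismatch rate per magnitude bin [edges[i], edges[i+1]).

    Returns (bin mid-point, rate, standard error) tuples; empty bins
    are skipped.
    """
    results = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        idx = [i for i, m in enumerate(mags) if lo <= m < hi]
        if not idx:
            continue
        rate, err = mismatch_rate([p_star[i] for i in idx],
                                  [ref_is_star[i] for i in idx],
                                  threshold)
        results.append(((lo + hi) / 2.0, rate, err))
    return results


# Toy data (hypothetical values, for illustration only).
mags = [16.8, 17.1, 19.9, 20.1, 20.3]
p_star = [0.97, 0.80, 0.40, 0.05, 0.60]
ref_is_star = [True, True, True, False, False]

overall_rate, overall_err = mismatch_rate(p_star, ref_is_star)
per_bin = binned_mismatch_rates(mags, p_star, ref_is_star,
                                [16.0, 18.0, 20.5])
```

With real data, `ref_is_star` would hold the SDSS r-band labels and `p_star` the posterior star probabilities of the matched UKIDSS sources.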



6.4 Value of the Bayesian method 

The good performance of both the UKIDSS pipeline classifier 
and our method over the entire sample is unsurprising, as most 
sources are detected with a sufficient signal-to-noise ratio 
that they can be classified without effort. However, it is 
often the case that the scientifically most important sources 
are those close to a new survey's detection limit: these 
objects would not have been detected by shallower surveys in 
the same band(s) and inevitably dominate the new discoveries 
from a given data set. Hence the inclusion of prior information 
in the Bayesian classifier is most important for just these 
sources, where it results in significant numbers of more 
accurate classifications. 

Our method provides realistic estimates of the classification 
uncertainties and allows users, by setting constraints on the 
posterior classification probabilities P_s, to specify the 
completeness (the fraction of target class sources that have 
actually been selected) and contamination (the fraction of the 
selected sources which are not from the target class) of a 
given selection before starting observations. Thus users can 
design the selection to suit the survey's aims. 

A practical application of our method would be to look at the 
amount of telescope time that would be required to follow up a 
morphologically selected sample of targets. If one imagines a 
spectroscopic survey of faint stars, and one were to trust 
star-galaxy separators such as the ones used by UKIDSS or SDSS 
rather than selecting sources with P_s > 0.9 from our method, 
then a certain proportion of telescope time would be spent 
observing compact or faint galaxies that were misclassified. 
While there will certainly also be misclassified sources when 
the selection is based on P_s, their proportion can be greatly 
reduced. 

Obviously there is a trade-off between completeness and 
efficiency when performing source selection. Table 3 lists, 
for different fluxes, both completeness and efficiency (the 
fraction of the selected sources which are actually of the 
target class) for different methods of selecting faint stars, 
namely selecting sources with P_s > 0.9 or P_s > 0.5, using 
the UKIDSS pipeline single-band or merged class labels, or 
selecting sources for which the UKIDSS pipeline posterior star 
probability exceeds 0.9. While the efficiencies of the 
different methods are comparable for Y ~ 17 and Y ~ 18, our 
method leads to better completeness levels at these fluxes 
(both for P_s > 0.5 and P_s > 0.9). For Y ~ 19, our method 
with P_s > 0.9, the UKIDSS merged class labels and the UKIDSS 
pipeline posteriors perform almost identically. Basing the 
selection on the UKIDSS Y-band class labels or on the 
posteriors from our method with P_s > 0.5 leads to higher 
completeness but lower efficiency (although our method yields 
a much higher completeness than the UKIDSS Y-band labels and 
also a marginally higher efficiency). Real differences can, 
however, be seen at Y ~ 20: while our method with P_s > 0.9 
has a much lower completeness level than the UKIDSS pipeline 
based methods, it also achieves a higher efficiency (0.866, 
compared to at most 0.799). If telescope time is limited and 
completeness not important, then basing source selection on 
P_s can lead to a considerable reduction in 'wasted' 
observation time. Using our method with P_s > 0.5 leads to 
completeness and efficiency levels more in line with the 
UKIDSS pipeline based methods. 


7 CONCLUSIONS 

We have developed a Bayesian formalism for star-galaxy 
classification in optical or NIR surveys that combines the 
morphological properties of an object (as measured in mul- 
tiple passbands) with prior knowledge of the star and galaxy 
populations. A fully Bayesian approach must also include 
colour information for self-consistency; but, given the aim of 
combining morphological information correctly, a number of 
approximations were developed to maximize the influence of 
the morphological information. 

We have demonstrated our method on data from the 
UKIDSS LAS, combining morphology statistics measured 
in the Y, J, H and K bands (or whatever subset of these 
a source was detected in). The morphology statistic used, 
ClassStat (Irwin et al. 2010), represents a powerful means 
of data compression from the full image, and contains almost 
all the useful information for the faint sources (which are 
the main motivation for the development of sophisticated 
star-galaxy classification techniques). However, the existing 
UKIDSS data products include only heuristic combinations 
of the band-specific classifications, and the application of the 
Bayesian method developed here makes it possible to extract 
all the useful UKIDSS information on a source's morphology 
in as self-consistent a manner as is possible without using 
colour information as well. In particular, the use of prior 
information avoids the overly-confident classification of faint 
sources, for which the available measurements contain little 
morphological information. 

Table 3. Completeness (comp.) and efficiency (eff.) for different selection methods at different fluxes. The SDSS Stripe 82 class 
labels have been taken as reference. 

                                                  16.6 < Y < 17.4   17.6 < Y < 18.4   18.6 < Y < 19.4   19.6 < Y < 20.4
Selection method                                  comp.   eff.      comp.   eff.      comp.   eff.      comp.   eff.
our method with P_s > 0.9                         0.980   0.996     0.968   0.993     0.785   0.971     0.103   0.866
our method with P_s > 0.5                         0.984   0.996     0.980   0.992     0.916   0.956     0.540   0.803
UKIDSS Y-band class star label = -1 (stars)       0.954   0.997     0.900   0.993     0.794   0.945     0.675   0.652
UKIDSS merged class star label = -1 (stars)       0.964   0.997     0.922   0.993     0.782   0.972     0.626   0.799
UKIDSS pipeline posterior star probability > 0.9  0.963   0.997     0.921   0.993     0.782   0.972     0.626   0.799
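
Completeness and efficiency values of the kind listed in Table 3 follow directly from the definitions given in Section 6.4. The sketch below is illustrative only: the function names and toy inputs are hypothetical, and `ref_is_star` stands in for the SDSS Stripe 82 reference labels. The final lines use the 19.6 < Y < 20.4 efficiencies from Table 3 to show how efficiency translates into 'wasted' targets, since contamination is one minus efficiency.

```python
def completeness_efficiency(p_star, ref_is_star, threshold=0.9):
    """Completeness and efficiency of selecting stars with p_star > threshold.

    completeness: fraction of reference stars that are selected
    efficiency:   fraction of selected sources that are reference stars
    """
    if len(p_star) != len(ref_is_star):
        raise ValueError("inputs must have the same length")
    selected = [p > threshold for p in p_star]
    n_selected = sum(selected)
    n_target = sum(bool(r) for r in ref_is_star)
    n_correct = sum(s and bool(r) for s, r in zip(selected, ref_is_star))
    completeness = n_correct / n_target if n_target else float("nan")
    efficiency = n_correct / n_selected if n_selected else float("nan")
    return completeness, efficiency


def expected_wasted_targets(n_targets, efficiency):
    """Expected number of non-target sources among n_targets selected."""
    return n_targets * (1.0 - efficiency)


# Toy data (hypothetical values, for illustration only).
p_star = [0.95, 0.92, 0.70, 0.30, 0.97]
ref_is_star = [True, True, True, False, False]
comp, eff = completeness_efficiency(p_star, ref_is_star, threshold=0.9)

# Efficiencies at 19.6 < Y < 20.4 taken from Table 3.
wasted_bayes = expected_wasted_targets(1000, 0.866)   # our method, P_s > 0.9
wasted_merged = expected_wasted_targets(1000, 0.799)  # UKIDSS merged labels
```

For a follow-up list of 1000 targets at this depth, the tabulated efficiencies correspond to roughly 134 non-stellar targets for the P_s > 0.9 selection and roughly 201 for the UKIDSS merged class labels, at the cost of the much lower completeness noted in the table.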

Our test sample of UKIDSS LAS sources was chosen to 
lie in the multiply-scanned SDSS Stripe 82 region, giving us 
independent and almost totally reliable classifications of all 
our sources. (This is a rare situation outside simulations, and 
an opportunity that could be used for a number of similar 
testing schemes.) Converting the posterior probabilities into 
class labels, the Bayesian classifier achieves an error rate 
of 0.068 at the UKIDSS detection limit, compared to 0.075 
for the UKIDSS pipeline. For all non-saturated sources, the 
error rate for our model lies at 0.022, compared to 0.044 for 
the UKIDSS pipeline. 

The Bayesian model described here for separating stars and 
galaxies can readily be applied to other surveys with similar 
statistics measuring the extendedness of sources. The multiple 
advantages of such a classifier (posterior probabilities, use 
of prior knowledge, rigorous computation of multi-band 
classifications, ability to cope with missing detections) and 
its good performance on the UKIDSS data provide a strong 
argument in favour of a wider use of this methodology. In 
particular, the use of our method can improve the efficiency 
with which telescope time is used. 
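
As an illustration of how posterior probabilities allow a selection to be designed before any follow-up observations are made, the expected completeness and contamination of a cut on P_s can be estimated from the posteriors alone. This is a sketch under the strong assumption that the posteriors are well calibrated (i.e. that a fraction P_s of the sources assigned posterior P_s really are stars); the function name and the toy inputs are hypothetical.

```python
def expected_selection_quality(p_star, threshold):
    """Expected completeness and contamination of the cut p_star > threshold.

    Assumes calibrated posteriors, so that the expected number of stars
    in any set of sources is the sum of their posterior star probabilities.
    """
    selected = [p for p in p_star if p > threshold]
    if not selected:
        raise ValueError("no sources pass the threshold")
    expected_stars_all = sum(p_star)
    if expected_stars_all <= 0.0:
        raise ValueError("expected number of stars must be positive")
    expected_stars_selected = sum(selected)
    completeness = expected_stars_selected / expected_stars_all
    contamination = 1.0 - expected_stars_selected / len(selected)
    return completeness, contamination


# Toy posteriors (hypothetical values, for illustration only).
p_star = [0.99, 0.95, 0.92, 0.60, 0.30, 0.05]
for threshold in (0.5, 0.9):
    comp, cont = expected_selection_quality(p_star, threshold)
    print(f"P_s > {threshold}: expected completeness {comp:.3f}, "
          f"expected contamination {cont:.3f}")
```

Raising the threshold lowers the expected contamination and the expected completeness together, which is the trade-off discussed in Section 6.4.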
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